Improving Similarity Search in Time Series Using Wavelets

Authors

  • Ioannis Liabotis
  • Babis Theodoulidis
  • Mohammad Saraee
Abstract

Sequences constitute a large portion of the data stored in databases. Data mining applications need to process similarity queries over large volumes of time series data, so query processing performance is an important consideration. This article proposes a similarity retrieval algorithm for time series. The proposed approach uses the wavelet transformation to reduce the dimensionality of the time series. The transformed series are indexed using X-Trees, a spatial indexing technique that can efficiently index high-dimensional data. The article proves that this technique outperforms the use of the Fourier transformation, since the wavelet transformation provides a better approximation of the time series. The experiments indicate that optimum performance is obtained using 16 to 20 wavelet coefficients. Furthermore, a novel mechanism is proposed for reducing the cost of false alarm removal. By storing the approximation coefficients of the penultimate level of the decomposition tree, the Euclidean distance between two sequences can first be calculated on those coefficients, further reducing the number of false alarms before the actual Euclidean distance is calculated using the complete time series. The article concludes with a detailed performance evaluation of the proposed similarity retrieval algorithm using data from the Greek stock market and temperature measurements from Athens. The comparison is made with techniques that use the Haar transform and the R*-Tree, and the proposed algorithm is shown to outperform them.
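The filter-and-refine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: it uses the orthonormal Haar transform for simplicity (the abstract lists Haar-based techniques among the baselines, so the article's own wavelet may differ), it replaces the X-Tree with a linear scan, it requires series whose length is a power of two, and all names (`approx_coeffs`, `range_query`, `k_coarse`, `k_fine`) are hypothetical. Because the transform is orthonormal, the distance between approximation coefficients never exceeds the true Euclidean distance, so pruning on it can cause false alarms but no false dismissals.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def approx_coeffs(x, k):
    """Apply haar_step repeatedly until only k approximation coefficients remain.

    Assumes len(x) and k are powers of two with k <= len(x).
    """
    a = list(x)
    while len(a) > k:
        a, _ = haar_step(a)
    return a

def euclid(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def range_query(series, query, eps, k_coarse=2, k_fine=4):
    """Return (indices within eps of query, number of full-length distance checks)."""
    q_coarse = approx_coeffs(query, k_coarse)
    q_fine = approx_coeffs(query, k_fine)
    matches, full_checks = [], 0
    for i, s in enumerate(series):
        # Stage 1: coarse filter (a linear scan standing in for the index lookup).
        if euclid(approx_coeffs(s, k_coarse), q_coarse) > eps:
            continue
        # Stage 2: coefficients one level finer remove more false alarms cheaply.
        if euclid(approx_coeffs(s, k_fine), q_fine) > eps:
            continue
        # Stage 3: exact Euclidean distance on the complete series.
        full_checks += 1
        if euclid(s, query) <= eps:
            matches.append(i)
    return matches, full_checks
```

In a real system the coefficients would be computed once and stored rather than recomputed per query; the sketch recomputes them only to stay short. Setting `k_fine` to twice `k_coarse` mirrors the abstract's use of the penultimate decomposition level as an intermediate filter.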


Similar resources

Similarity Search Over Time-Series Data Using Wavelets

We consider the use of wavelet transformations as a dimensionality reduction technique to permit efficient similarity search over high-dimensional time-series data. While numerous transformations have been proposed and studied, the only wavelet that has been shown to be effective for this application is the Haar wavelet. In this work, we observe that a large class of wavelet transformations (no...


A combined Wavelet-Artificial Neural Network model and its application to the prediction of groundwater level fluctuations

Accurate groundwater level modeling and forecasting contribute to civil projects, land use, city planning and water resources management. The combined Wavelet-Artificial Neural Network (WANN) model has been widely used in recent years to forecast hydrological and hydrogeological phenomena. This study investigates the sensitivity of the pre-processing to the wavelet type and decomposition level in ...


Using Wavelets and Splines to Forecast Non-Stationary Time Series

This paper deals with short-term forecasting of non-stationary time series using wavelets and splines. Wavelets can decompose the series as the sum of two components, one low frequency and one high frequency. Aminghafari and Poggi (2007) proposed to predict the high frequency component by wavelets and extrapolate the low frequency component by local polynomial fitting. We propose to forecast non-stationary process u...


Clinical Decision Support by Time Series Classification Using Wavelets

Clinicians sometimes need help with diagnoses, or simply need reassurance that they are making the right decision. This could be provided to the clinician in the form of a decision support system. We have designed and implemented a decision support system for the classification of time series. The system is called HR3Modul and is designed to assist clinicians in the diagnosis of respiratory sinus ...


Some New Methods for Prediction of Time Series by Wavelets

Extended Abstract. Forecasting is one of the most important purposes of time series analysis. For many years, classical methods were used for this aim. But these methods do not perform well on real time series because of the non-linearity and non-stationarity of these data sets. On one hand, most real-world time series data display a time-varying second order structure. On th...



Journal:
  • IJDWM

Volume 2

Publication date: 2006